NSF PAR Search | NSF Public Access Repository

Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation

Wu, Di; Gu, Jia-Chen; Yin, Fan; Peng, Nanyun; Chang, Kai-Wei (November 2024, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing)

Full Text Available
Red Teaming Language Model Detectors with Language Models

Shi, Zhouxing; Wang, Yihan; Yin, Fan; Chen, Xiangning; Chang, Kai-Wei; Hsieh, Cho-Jui (October 2024, Transactions of the Association for Computational Linguistics (TACL))

Full Text Available
Red Teaming Language Model Detectors with Language Models

https://doi.org/10.1162/tacl_a_00639

Shi, Zhouxing; Wang, Yihan; Yin, Fan; Chen, Xiangning; Chang, Kai-Wei; Hsieh, Cho-Jui (January 2024, Transactions of the Association for Computational Linguistics)

The prevalence and strong capability of large language models (LLMs) present significant safety and ethical risks if exploited by malicious users. To prevent potentially deceptive use of LLMs, recent work has proposed algorithms to detect LLM-generated text and protect LLMs. In this paper, we investigate the robustness and reliability of these LLM detectors under adversarial attacks. We study two types of attack strategies: 1) replacing certain words in an LLM’s output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation. In both strategies, we leverage an auxiliary LLM to generate the word replacements or the instructional prompt. Unlike previous work, we consider a challenging setting where the auxiliary LLM can also be protected by a detector. Experiments reveal that our attacks effectively compromise the performance of all detectors in the study with plausible generations, underscoring the urgent need to improve the robustness of LLM-generated text detection systems. Code is available at https://github.com/shizhouxing/LLM-Detector-Robustness
Full Text Available
Highly scalable maximum likelihood and conjugate Bayesian inference for ERGMs on graph sets with equivalent vertices

https://doi.org/10.1371/journal.pone.0273039

Yin, Fan; Butts, Carter T. (August 2022, PLOS ONE)
De Vico Fallani, Fabrizio (Ed.)
The exponential family random graph modeling (ERGM) framework provides a highly flexible approach for the statistical analysis of networks (i.e., graphs). As ERGMs with dyadic dependence involve normalizing factors that are extremely costly to compute, practical strategies for ERGM inference generally employ a variety of approximations or other workarounds. Markov chain Monte Carlo maximum likelihood (MCMC MLE) provides a powerful tool to approximate the maximum likelihood estimator (MLE) of ERGM parameters, and is generally feasible for typical models on single networks with as many as a few thousand nodes. MCMC-based algorithms for Bayesian analysis are more expensive, and high-quality answers are challenging to obtain on large graphs. For both strategies, extension to the pooled case (in which we observe multiple networks from a common generative process) adds further computational cost, with both time and memory scaling linearly in the number of graphs. This becomes prohibitive for large networks, or cases in which large numbers of graph observations are available. Here, we exploit some basic properties of discrete exponential families to develop an approach for ERGM inference in the pooled case that (where applicable) allows an arbitrarily large number of graph observations to be fit at no additional computational cost beyond preprocessing the data itself. Moreover, a variant of our approach can also be used to perform Bayesian inference under conjugate priors, again with no additional computational cost in the estimation phase. The latter can be employed either for single graph observations, or for observations from graph sets. As we show, the conjugate prior is easily specified, and is well-suited to applications such as regularization. Simulation studies show that the pooled method leads to estimates with good frequentist properties, and posterior estimates under the conjugate prior are well-behaved. We demonstrate the usefulness of our approach with applications to pooled analysis of brain functional connectivity networks and to replicated x-ray crystal structures of hen egg-white lysozyme.
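The "basic properties of discrete exponential families" mentioned in this abstract can be illustrated with a short derivation. This is a sketch under assumptions, not the authors' own presentation: the symbols \theta (parameters), t(G) (sufficient statistics), \kappa(\theta) (normalizing factor), m (number of graphs), and the prior hyperparameters t_0 and n_0 are notation introduced here for illustration, and all graphs are assumed to be independent draws over a common support.

```latex
% ERGM as a discrete exponential family over a common graph support \mathcal{G}
\[
  P(G \mid \theta) = \frac{\exp\{\theta^{\top} t(G)\}}{\kappa(\theta)},
  \qquad
  \kappa(\theta) = \sum_{G' \in \mathcal{G}} \exp\{\theta^{\top} t(G')\}.
\]

% Pooled log-likelihood for m independent graphs G_1, \dots, G_m
\[
  \ell(\theta)
  = \sum_{i=1}^{m} \bigl[\theta^{\top} t(G_i) - \log \kappa(\theta)\bigr]
  = m \bigl[\theta^{\top} \bar{t} - \log \kappa(\theta)\bigr],
  \qquad
  \bar{t} = \frac{1}{m} \sum_{i=1}^{m} t(G_i).
\]

% Conjugate prior with hypothetical hyperparameters t_0 (prior mean statistic)
% and n_0 (prior weight), and the resulting posterior
\[
  \pi(\theta \mid t_0, n_0) \propto
  \exp\bigl\{ n_0 \bigl[\theta^{\top} t_0 - \log \kappa(\theta)\bigr] \bigr\},
\]
\[
  \pi(\theta \mid G_1, \dots, G_m) \propto
  \exp\bigl\{ (n_0 + m) \bigl[\theta^{\top} t^{*} - \log \kappa(\theta)\bigr] \bigr\},
  \qquad
  t^{*} = \frac{n_0 t_0 + m \bar{t}}{n_0 + m}.
\]
```

Under these assumptions the data enter the pooled likelihood only through the mean statistic \bar{t}, which is computed once during preprocessing. The posterior under the conjugate prior has the same functional form as the likelihood, with \bar{t} replaced by the weighted average t^{*}.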
Full Text Available
ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation

https://doi.org/10.18653/v1/2022.emnlp-main.440

Yin, Fan; Li, Yao; Hsieh, Cho-Jui; Chang, Kai-Wei (December 2022, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing)

Full Text Available
On the Sensitivity and Stability of Model Interpretations in NLP

https://doi.org/10.18653/v1/2022.acl-long.188

Yin, Fan; Shi, Zhouxing; Hsieh, Cho-Jui; Chang, Kai-Wei (January 2022, Annual Meeting of the Association for Computational Linguistics (ACL))

Recent years have witnessed the emergence of a variety of post-hoc interpretations that aim to uncover how natural language processing (NLP) models make predictions. Despite the surge of new interpretation methods, it remains an open problem how to define and quantitatively measure the faithfulness of interpretations, i.e., to what extent interpretations reflect the reasoning process of a model. We propose two new criteria, sensitivity and stability, that provide complementary notions of faithfulness to the existing removal-based criteria. Our results show that conclusions about how faithful interpretations are can vary substantially depending on the notion used. Motivated by the desiderata of sensitivity and stability, we introduce a new class of interpretation methods that adopt techniques from adversarial robustness. Empirical results show that our proposed methods are effective under the new criteria and overcome limitations of gradient-based methods on removal-based criteria. Besides text classification, we also apply interpretation methods and metrics to dependency parsing. Our results shed light on understanding the diverse set of interpretations.
Full Text Available
ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation

Yin, Fan; Li, Yao; Hsieh, Cho-Jui; Chang, Kai-Wei (January 2022, Empirical Methods in Natural Language Processing (EMNLP))

Full Text Available
On the Sensitivity and Stability of Model Interpretations

Yin, Fan; Shi, Zhouxing; Hsieh, Cho-Jui; Chang, Kai-Wei (January 2022, Annual Meeting of the Association for Computational Linguistics (ACL))

Full Text Available
Geographical patterns of social cohesion drive disparities in early COVID infection hazard

https://doi.org/10.1073/pnas.2121675119

Thomas, Loring J.; Huang, Peng; Yin, Fan; Xu, Junlan; Almquist, Zack W.; Hipp, John R.; Butts, Carter T. (March 2022, Proceedings of the National Academy of Sciences)

The uneven spread of COVID-19 has resulted in disparate experiences for marginalized populations in urban centers. Using computational models, we examine the effects of local cohesion on COVID-19 spread in social contact networks for the city of San Francisco, finding that more early COVID-19 infections occur in areas with strong local cohesion. This spatially correlated process tends to affect Black and Hispanic communities more than their non-Hispanic White counterparts. Local social cohesion thus acts as a potential source of hidden risk for COVID-19 infection.
Full Text Available
On the Robustness of Language Encoders against Grammatical Errors

https://doi.org/10.18653/v1/2020.acl-main.310

Yin, Fan; Long, Quanyu; Meng, Tao; Chang, Kai-Wei (January 2020, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics)

We conduct a thorough study to diagnose the behaviors of pre-trained language encoders (ELMo, BERT, and RoBERTa) when confronted with natural grammatical errors. Specifically, we collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data. We use this approach to facilitate debugging models on downstream applications. Results confirm that the performance of all tested models is affected but the degree of impact varies. To interpret model behaviors, we further design a linguistic acceptability task to reveal their abilities in identifying ungrammatical sentences and the position of errors. We find that fixed contextual encoders with a simple classifier trained on the prediction of sentence correctness are able to locate error positions. We also design a cloze test for BERT and discover that BERT captures the interaction between errors and specific tokens in context. Our results shed light on understanding the robustness and behaviors of language encoders against grammatical errors.
Full Text Available
